Fast Average-Case Pattern Matching on Weighted Sequences
Similar resources
Fast Average-Case Pattern Matching on Weighted Sequences
A weighted string over an alphabet of size σ is a string in which a set of letters may occur at each position with respective occurrence probabilities. Weighted strings, also known as position weight matrices or uncertain sequences, naturally arise in many contexts. In this article, we study the problem of weighted string matching with a special focus on average-case analysis. Given a weighted ...
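As a rough illustration of the definition above, a weighted string can be modelled as a list of per-position letter distributions, and the probability that an ordinary pattern occurs at a given position is the product of the matching letters' probabilities. The sketch below is not the article's algorithm; the names `occurrence_probability` and `match_positions`, the example data, and the cut-off parameter `threshold` are all assumptions introduced here for illustration.

```python
from typing import Dict, List

# Hypothetical representation: one {letter: probability} map per position.
WeightedString = List[Dict[str, float]]


def occurrence_probability(text: WeightedString, pattern: str, start: int) -> float:
    """Product of the probabilities of the pattern's letters at text[start:]."""
    if start < 0 or start + len(pattern) > len(text):
        return 0.0
    prob = 1.0
    for offset, letter in enumerate(pattern):
        # A letter absent from a position's distribution has probability 0.
        prob *= text[start + offset].get(letter, 0.0)
    return prob


def match_positions(text: WeightedString, pattern: str, threshold: float) -> List[int]:
    """Naive scan: positions whose occurrence probability reaches `threshold`.

    The threshold is an assumed parameter, not taken from the truncated abstract.
    """
    last_start = len(text) - len(pattern)
    return [
        i
        for i in range(last_start + 1)
        if occurrence_probability(text, pattern, i) >= threshold
    ]


# Made-up weighted string of length 4 over the alphabet {A, C, G}.
example: WeightedString = [
    {"A": 1.0},
    {"A": 0.5, "C": 0.5},
    {"C": 0.75, "G": 0.25},
    {"A": 1.0},
]

if __name__ == "__main__":
    print(occurrence_probability(example, "AC", 0))
    print(match_positions(example, "AC", 0.25))
```

This scan checks every alignment in full, so it only serves to make the definition concrete; it says nothing about the average-case behaviour studied in the article.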
Pattern Matching on Weighted Sequences
Weighted sequences are used extensively as profiles for protein families, in the representation of binding sites and often for the representation of sequences produced by a shotgun sequencing strategy. We present various fundamental pattern matching problems on weighted sequences and their respective algorithms. In addition, we define two matching probabilistic measures and we give algorithms f...
Designing optimal- and fast-on-average pattern matching algorithms
Given a pattern w and a text t, the speed of a pattern matching algorithm over t with regard to w is the ratio of the length of t to the number of text accesses performed to search for w in t. We first propose a general method for computing the limit of the expected speed of pattern matching algorithms, with regard to w, over iid texts. Next, we show how to determine the greatest speed which can...
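The notion of speed defined above (text length divided by the number of text accesses) can be measured empirically on a single text. The sketch below does this for two well-known algorithms, a naive scan and Horspool's variant of Boyer-Moore; these were chosen here only for illustration and are not claimed to be the algorithms analysed in that paper. The function names and the convention that re-reading an already inspected character is not counted again are assumptions.

```python
from typing import List, Tuple


def naive_search(text: str, pattern: str) -> Tuple[List[int], int]:
    """Left-to-right check of every alignment; returns (hits, text accesses)."""
    if not pattern:
        raise ValueError("pattern must be non-empty")
    m, accesses, hits = len(pattern), 0, []
    for i in range(len(text) - m + 1):
        j = 0
        while j < m:
            accesses += 1  # one read of text[i + j]
            if text[i + j] != pattern[j]:
                break
            j += 1
        if j == m:
            hits.append(i)
    return hits, accesses


def horspool_search(text: str, pattern: str) -> Tuple[List[int], int]:
    """Horspool's algorithm; returns (hits, text accesses)."""
    if not pattern:
        raise ValueError("pattern must be non-empty")
    m = len(pattern)
    # Shift table built from every pattern character except the last one.
    shift = {c: m - 1 - k for k, c in enumerate(pattern[:-1])}
    accesses, hits, i = 0, [], 0
    while i + m <= len(text):
        j = m - 1
        while j >= 0:  # compare right to left
            accesses += 1  # one read of text[i + j]
            if text[i + j] != pattern[j]:
                break
            j -= 1
        if j < 0:
            hits.append(i)
        # text[i + m - 1] was already read above, so it is not counted again.
        i += shift.get(text[i + m - 1], m)
    return hits, accesses


def speed(text_length: int, accesses: int) -> float:
    """Ratio of text length to number of text accesses."""
    return float("inf") if accesses == 0 else text_length / accesses


if __name__ == "__main__":
    t, w = "abracadabra", "abra"
    for search in (naive_search, horspool_search):
        hits, accesses = search(t, w)
        print(search.__name__, hits, accesses, speed(len(t), accesses))
```

A single measurement like this depends on the particular text; the abstract concerns the limit of the expected speed over iid texts, which this sketch does not compute.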
Pattern Matching and Consensus Problems on Weighted Sequences and Profiles
We study pattern matching problems on two major representations of uncertain sequences used in molecular biology: weighted sequences (also known as position weight matrices, PWM) and profiles (i.e., scoring matrices). In the simple version, in which only the pattern or only the text is uncertain, we obtain efficient algorithms with theoretically-provable running times using a variation of the l...
Two simple heuristics for the pattern matching on weighted sequences
Weighted sequences are used as profiles for protein families, in the representation of binding sites, and sequences produced by a DNA shotgun sequencing assembly. In this paper we present two simple heuristics for the pattern matching on weighted sequences. One is a simple heuristic which enables a faster validation between a weighted candidate and a weighted text. The other is applying the bad...
Journal
Journal title: International Journal of Foundations of Computer Science
Year: 2018
ISSN: 0129-0541,1793-6373
DOI: 10.1142/s0129054118430062